5 research outputs found

Podify : a podcast streaming platform with automatic logging of user behaviour for academic research

Author: Meggetto Francesco
Moshfeghi Yashar
Publication venue: Association for Computing Machinery (ACM)
Publication date: 27/07/2023

Podcasts are spoken documents that, in recent years, have gained widespread popularity. Despite the growing research interest in this domain, conducting user studies remains challenging due to the lack of datasets that include user behaviour. In particular, there is a need for a podcast streaming platform that reduces the overhead of conducting user studies. To address these issues, in this work, we present Podify. It is the first web-based platform for podcast streaming and consumption specifically designed for research. The platform closely resembles existing streaming systems to provide users with a high level of familiarity on both desktop and mobile. A catalogue of podcast episodes can be easily created via RSS feeds. The platform also offers Elasticsearch-based indexing and search that is highly customisable, allowing research and experimentation in podcast search. Users can manually curate playlists of podcast episodes for consumption. In addition to mechanisms for collecting explicit feedback from users (i.e., liking and disliking behaviour), Podify automatically collects implicit feedback (i.e., all user interactions). Users' behaviour can be easily exported to a readable format for subsequent experimental analysis. A demonstration of the platform is available at https://youtu.be/k9Z5w_KKHr8, with the code and documentation available at https://github.com/NeuraSearch/Podify

University of Strathclyde Institutional Repository

On Building a Podcast Collection with User Interactions

Author: Jones Rosie
Meggetto Francesco
Moshfeghi Yashar
Publication venue: University of Strathclyde
Publication date: 01/09/2021

The podcast is a growing listening medium that has surged in popularity in recent years. Despite the great research opportunities, it has attracted only limited attention from the community so far. This is mainly due to the lack of available data collections, which has considerably restricted research in academia. To facilitate such research, the Spotify Podcast Dataset, a corpus of 100k episodes with associated text transcripts and metadata, was released in 2020. However, no user interactions are available, which limits its usability for certain domains, such as recommendation, personalisation, and user behaviour and consumption analysis. In this position paper, we present various approaches to augmenting this collection with user interactions, together with their respective strengths and weaknesses. If developed further, this work has the potential for a broader impact on the research community.

University of Strathclyde Institutional Repository

Why people skip music? On predicting music skips using deep reinforcement learning

Author: Levine John
Meggetto Francesco
Moshfeghi Yashar
Revie Crawford
Publication venue: arXiv
Publication date: 10/01/2023

Music recommender systems are an integral part of our daily life. Recent research has seen a significant effort around black-box recommender-based approaches such as Deep Reinforcement Learning (DRL). These advances have led, together with the increasing concerns around users' data collection and privacy, to a strong interest in building responsible recommender systems. A key element of a successful music recommender system is modelling how users interact with streamed content. By first understanding these interactions, insights can be drawn to enable the construction of more transparent and responsible systems. An example of these interactions is skipping behaviour, a signal that can measure users' satisfaction, dissatisfaction, or lack of interest. In this paper, we study the utility of users' historical data for the task of sequentially predicting users' skipping behaviour. To this end, we adapt DRL for this classification task, followed by a post-hoc explainability (SHAP) and ablation analysis of the input state representation. Experimental results from a real-world music streaming dataset (Spotify) demonstrate the effectiveness of our approach in this task by outperforming state-of-the-art models. A comprehensive analysis of our approach and of users' historical data reveals a temporal data leakage problem in the dataset. Our findings indicate that, overall, users' behaviour features are the most discriminative in how our proposed DRL model predicts music skips. Content and contextual features have a lesser effect. This suggests that a limited amount of user data should be collected and leveraged to predict skipping behaviour.

University of Strathclyde Institutional Repository

Why people skip music? On predicting music skips using deep reinforcement learning

Author: Levine John
Meggetto Francesco
Moshfeghi Yashar
Revie Crawford
Publication venue: Association for Computing Machinery (ACM)
Publication date: 10/01/2023

Music recommender systems are an integral part of our daily life. Recent research has seen a significant effort around black-box recommender-based approaches such as Deep Reinforcement Learning (DRL). These advances have led, together with the increasing concerns around users' data collection and privacy, to a strong interest in building responsible recommender systems. A key element of a successful music recommender system is modelling how users interact with streamed content. By first understanding these interactions, insights can be drawn to enable the construction of more transparent and responsible systems. An example of these interactions is skipping behaviour, a signal that can measure users’ satisfaction, dissatisfaction, or lack of interest. In this paper, we study the utility of users' historical data for the task of sequentially predicting users' skipping behaviour. To this end, we adapt DRL for this classification task, followed by a post-hoc explainability (SHAP) and ablation analysis of the input state representation. Experimental results from a real-world music streaming dataset (Spotify) demonstrate the effectiveness of our approach in this task by outperforming state-of-the-art models. A comprehensive analysis of our approach and of users’ historical data reveals a temporal data leakage problem in the dataset. Our findings indicate that, overall, users' behaviour features are the most discriminative in how our proposed DRL model predicts music skips. Content and contextual features have a lesser effect. This suggests that a limited amount of user data should be collected and leveraged to predict skipping behaviour.

University of Strathclyde Institutional Repository

On skipping behaviour types in music streaming sessions

Author: Levine John
Meggetto Francesco
Moshfeghi Yashar
Revie Crawford
Publication venue: Association for Computing Machinery (ACM)
Publication date: 26/10/2021

The ability to skip songs is a core feature in modern online streaming services. Its introduction has led to a new music listening paradigm and has changed the way users interact with the underlying services. Thus, understanding their skipping activity during listening sessions has acquired considerable importance. This is because such an implicit feedback signal can be considered a measure of users' satisfaction (dissatisfaction or lack of interest), affecting their engagement with the platforms. Prior work has mainly focused on analysing the skipping activity at an individual song level. In this work, we investigate different behaviours during entire listening sessions with regard to the users' session-based skipping activity. To this end, we propose a data transformation and clustering-based approach to identify and categorise skipping types. Experimental results on the real-world music streaming dataset (Spotify) indicate four main types of session skipping behaviour. A subsequent analysis of short, medium, and long listening sessions demonstrates that these session skipping types are consistent across sessions of varying length. Furthermore, we discuss their distributional differences under various listening context information, namely day types (weekday and weekend), times of the day, and playlist types.

University of Strathclyde Institutional Repository